Tradeoffs in Measuring Entity Similarity for Pattern Detection in OWL Ontologies
Abstract
Syntactic regularities are repetitive structures in the asserted axioms of an ontology, represented as generalisations, that is, axioms with variables. The Regularity Inspector for Ontologies (RIO) is a framework for detecting such regularities in ontologies. Established clustering techniques are applied to the signature of the ontology to detect clusters of similar entities. Clustering depends on pairwise entity distances, which determine the similarity of two entities. In this paper we present three variations of the similarity definition that affect pairwise distances and thus the regularities detected. Our analysis explores and compares methods that capture regularities of different granularity; in particular, we analyse commonalities and differences between the generalisations and clusters that result from the three variations of similarity and check whether they capture dominant patterns in the ontology in the same way. We perform the analysis using the BioPortal corpus and discuss the tradeoffs of each similarity function.
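To make the dependence of clusters on pairwise distances concrete, the following is a minimal sketch and not RIO's actual implementation. It assumes, purely for illustration, that each entity is described by the set of axiom shapes it occurs in once the entity itself is replaced by a variable, that distance is the Jaccard distance between those sets, and that clustering is plain average-linkage agglomeration with a cut-off. The entity names, the axiom strings, the `threshold` parameter, and the function names are all hypothetical.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """Return 1 - |a & b| / |a | b|; two empty sets count as identical."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def cluster(features, threshold):
    """Average-linkage agglomerative clustering over pairwise distances.

    Repeatedly merges the two closest clusters and stops once the closest
    pair is at distance >= threshold (an assumed stopping rule).
    """
    clusters = [frozenset([name]) for name in sorted(features)]

    def linkage(c1, c2):
        # Mean pairwise distance between members of the two clusters.
        dists = [jaccard_distance(features[x], features[y])
                 for x in c1 for y in c2]
        return sum(dists) / len(dists)

    while len(clusters) > 1:
        c1, c2 = min(combinations(clusters, 2), key=lambda p: linkage(*p))
        if linkage(c1, c2) >= threshold:
            break
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 | c2]
    return clusters


# Hypothetical entities mapped to the axiom shapes they appear in,
# with the entity itself abstracted to the variable ?x.
features = {
    "Margherita": {"SubClassOf(?x, Pizza)", "SubClassOf(?x, hasTopping some ?y)"},
    "Napoletana": {"SubClassOf(?x, Pizza)", "SubClassOf(?x, hasTopping some ?y)"},
    "Mozzarella": {"SubClassOf(?x, Cheese)"},
    "Parmesan": {"SubClassOf(?x, Cheese)"},
}

result = cluster(features, threshold=0.5)
for members in result:
    print(sorted(members))
```

Changing what goes into each feature set, for example abstracting more or fewer parts of an axiom into variables, changes the pairwise distances and therefore the clusters; that is the kind of granularity tradeoff the abstract refers to.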
Similar resources
Measuring Similarity of Elements in OWL DL Ontologies
OWL is becoming an increasingly widely used language for representing ontologies. The number of OWL ontologies, growing in direct proportion to the development of the Semantic Web, leads to the heterogeneity problem. The same concepts may be modeled differently, using different terms and different positions in the concept hierarchy. The task of identifying similar entities (concepts, relations or ...
Similarity-Based Ontology Alignment in OWL-Lite
Interoperability of heterogeneous systems on the Web will be admittedly achieved through an agreement between the underlying ontologies. However, the richer the ontology description language, the more complex the agreement process, and hence the more sophisticated the required tools. Among current ontology alignment paradigms, similarity-based approaches are both powerful and flexible enough fo...
Towards Ontology Matching via Pattern-Based Detection of Semantic Structures in OWL Ontologies
Ontology Matching (OM) is nowadays a vivid area of Computer Science. There are several OM tools looking for correspondences between entities of ontologies. These correspondences are usually simple equivalence mapping pairs, class-to-class or property-to-property. In our work we concentrate on diverse kinds of semantic structures in ontologies in terms of their detection and mutual matching. For this kin...
OLA in the OAEI 2005 Alignment Contest
Among the variety of alignment approaches (e.g., using machine learning, subsumption computation, formal concept analysis, etc.) similarity-based ones rely on a quantitative assessment of pair-wise likeness between entities. Our own alignment tool, OLA, features a similarity model rooted in principles such as: completeness on the ontology language features, weighting of different feature contri...
A New Method for Duplicate Detection Using Hierarchical Clustering of Records
Accuracy and validity of data are prerequisites for the correct operation of any software system. Errors in data can always occur because of human and system faults. One such error is the existence of duplicate records in data sources. Duplicate records refer to the same real-world entity. Only one of them should be in a data source, but for some reasons like aggregation of ...